139 research outputs found

    A Deep Network Model for Paraphrase Detection in Short Text Messages

    Full text link
    This paper is concerned with paraphrase detection. The ability to detect similar sentences written in natural language is crucial for several applications, such as text mining, text summarization, plagiarism detection, authorship authentication and question answering. Given two sentences, the objective is to detect whether they are semantically identical. An important insight from this work is that existing paraphrase systems perform well when applied on clean texts, but they do not necessarily deliver good performance against noisy texts. Challenges with paraphrase detection on user generated short texts, such as Twitter, include language irregularity and noise. To cope with these challenges, we propose a novel deep neural network-based approach that relies on coarse-grained sentence modeling using a convolutional neural network and a long short-term memory model, combined with a specific fine-grained word-level similarity matching model. Our experimental results show that the proposed approach outperforms existing state-of-the-art approaches on user-generated noisy social media data, such as Twitter texts, and achieves highly competitive performance on a cleaner corpus

    Consumer-side Fairness in Recommender Systems: A Systematic Survey of Methods and Evaluation

    Full text link
    In the current landscape of ever-increasing levels of digitalization, we are facing major challenges pertaining to scalability. Recommender systems have become irreplaceable both for helping users navigate the increasing amounts of data and, conversely, aiding providers in marketing products to interested users. The growing awareness of discrimination in machine learning methods has recently motivated both academia and industry to research how fairness can be ensured in recommender systems. For recommender systems, such issues are well exemplified by occupation recommendation, where biases in historical data may lead to recommender systems relating one gender to lower wages or to the propagation of stereotypes. In particular, consumer-side fairness, which focuses on mitigating discrimination experienced by users of recommender systems, has seen a vast number of diverse approaches for addressing different types of discrimination. The nature of said discrimination depends on the setting and the applied fairness interpretation, of which there are many variations. This survey serves as a systematic overview and discussion of the current research on consumer-side fairness in recommender systems. To that end, a novel taxonomy based on high-level fairness interpretation is proposed and used to categorize the research and their proposed fairness evaluation metrics. Finally, we highlight some suggestions for the future direction of the field.Comment: Draft submitted to Springer (November 2022

    A latent model for collaborative filtering

    Get PDF
    AbstractRecommender systems based on collaborative filtering have received a great deal of interest over the last two decades. In particular, recently proposed methods based on dimensionality reduction techniques and using a symmetrical representation of users and items have shown promising results. Following this line of research, we propose a probabilistic collaborative filtering model that explicitly represents all items and users simultaneously in the model. Experimental results show that the proposed system obtains significantly better results than other collaborative filtering systems (evaluated on the MovieLens data set). Furthermore, the explicit representation of all users and items allows the model to e.g. make group-based recommendations balancing the preferences of the individual users

    A latent model for collaborative filtering

    Get PDF

    Prediction Intervals: Split Normal Mixture from Quality-Driven Deep Ensembles

    Full text link
    Prediction intervals are a machine- and human-interpretable way to represent predictive uncertainty in a regression analysis. In this paper, we present a method for generating prediction intervals along with point estimates from an ensemble of neural networks. We propose a multi-objective loss function fusing quality measures related to prediction intervals and point estimates, and a penalty function, which enforces semantic integrity of the results and stabilizes the training process of the neural networks. The ensembled prediction intervals are aggregated as a split normal mixture accounting for possible multimodality and asymmetricity of the posterior predictive distribution, and resulting in prediction intervals that capture aleatoric and epistemic uncertainty. Our results show that both our quality-driven loss function and our aggregation method contribute to well-calibrated prediction intervals and point estimates
    • …
    corecore